Prompts
layered_prompt renders a deterministic system prompt from explicit layers and returns a LayeredPrompt value with a SHA-256 cache key derived from the rendered text.
from flowai_harness import define_tool, layered_prompt

domain_knowledge = {
    "entities": ["product", "segment", "channel"],
    "metrics": ["revenue", "margin"],
}

@define_tool(
    name="search_products",
    description="Search products by query.",
    input_schema={"query": str, "limit": int},
    approval="never",
)
async def search_products(args, ctx):
    ...

coordinator_prompt = layered_prompt(
    identity="You coordinate scenario planning for Acme customers.",
    communication="Be concise and surface approval points explicitly.",
    operational_rules=[
        "Route plan-building work to the planner.",
        "Route approved materialization work to the executor.",
    ],
    tools=[search_products],
    domain_knowledge=domain_knowledge,
    safety=["Never execute side-effecting tools without approval."],
    output_format={"events": ["plan_proposed", "approval_required", "tool_result"]},
    examples=[{"user": "Branch Q3 pricing into a 5% promotion for top EU SKUs."}],
)

Why layering
- Identity is per-agent.
- Domain knowledge is shared structured prompt content, owned by customer/domain code.
- Operational rules are behavioral instructions separate from domain facts.
- Tools are auto-derived from tool specs and rendered consistently.
planner = define_planner(
    name="scenario_planner",
    model="claude-sonnet-4-6",
    plan=scenario_plan,
    prompt=layered_prompt(
        identity="You produce typed scenario plans.",
        domain_knowledge=domain_knowledge,
    ),
)

executor = define_executor(
    name="scenario_executor",
    model="claude-sonnet-4-6",
    plan=scenario_plan,
    tools=[search_products],
    prompt=layered_prompt(
        identity="You execute approved scenario plans action by action.",
        operational_rules=["Execute only approved plans."],
        domain_knowledge=domain_knowledge,
        tools=[search_products],
    ),
)

Section order
layered_prompt always renders sections in this fixed order, omitting empty sections:
- Identity — who the agent is. The only required section.
- Communication — tone, format, conventions.
- Operational Rules — bullet list of behavioral rules.
- Tools — auto-derived Markdown table of tool name, description, and approval policy.
- Domain Knowledge — any structured value, rendered as JSON.
- Safety — bullet list of guardrails.
- Output Format — structured description of expected output.
- Examples — structured examples.
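The fixed ordering can be pictured with a small stand-alone sketch. This is not the harness implementation: `render_sections` and `SECTION_ORDER` are hypothetical names, each section body is assumed to be already rendered to text, and the blank line between sections is an assumption. Only the `# Identity` heading style is taken from the wire JSON example later on this page.

```python
# Hypothetical sketch of fixed-order rendering; not the flowai_harness code.
SECTION_ORDER = [
    ("identity", "Identity"),
    ("communication", "Communication"),
    ("operational_rules", "Operational Rules"),
    ("tools", "Tools"),
    ("domain_knowledge", "Domain Knowledge"),
    ("safety", "Safety"),
    ("output_format", "Output Format"),
    ("examples", "Examples"),
]


def render_sections(**layers: str) -> str:
    """Join pre-rendered section bodies in the fixed order, skipping empty ones."""
    if not layers.get("identity"):
        raise ValueError("identity is the only required section")
    unknown = set(layers) - {key for key, _ in SECTION_ORDER}
    if unknown:
        raise TypeError(f"unknown sections: {sorted(unknown)}")
    parts = []
    for key, heading in SECTION_ORDER:
        body = layers.get(key)
        if body:  # empty or missing sections are omitted entirely
            parts.append(f"# {heading}\n{body}")
    # Assumption: sections are separated by one blank line.
    return "\n\n".join(parts)


# Keyword order at the call site does not change the output order.
text = render_sections(safety="- Never guess.", identity="You plan scenarios.")
```

Because the order is fixed and empty sections vanish, two calls with the same layers always produce the same text, which is what makes a text-derived cache key stable.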
Tenant identity is not rendered automatically. If a prompt needs domain facts, pass them through
domain_knowledge; if it needs operating constraints, pass them through operational_rules.
When define_runtime(...) assembles agents, it also merges runtime-visible tools into the same # Tools section:
- explicit layered_prompt(..., tools=...) rows;
- agent-bound Python tools passed to define_coordinator(...), define_planner(...), define_executor(...), or define_specialist(...) through tools=[...];
- enabled built-in toolkit tools from toolkits=[...], including narrowed toolkit configs such as ToolkitSpec(id="catalog", config={"tools": ["execute_query"]}).
Duplicate tool names render once. Explicit prompt rows win so customer-authored descriptions are not overwritten by generated toolkit descriptions.
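A minimal sketch of that merge rule, under stated assumptions: `merge_tool_rows` is a hypothetical helper, rows are modelled as `(name, description)` pairs, and the relative priority of agent-bound tools over toolkit tools is a guess. The only behavior taken from the text above is that duplicates render once and explicit prompt rows win.

```python
# Hypothetical sketch of the de-duplication rule; not the runtime's merge code.
from typing import Iterable, List, Tuple

ToolRow = Tuple[str, str]  # (tool name, description)


def merge_tool_rows(
    explicit: Iterable[ToolRow],
    agent_bound: Iterable[ToolRow],
    toolkit: Iterable[ToolRow],
) -> List[ToolRow]:
    """Merge rows so each tool name appears once; the earliest source wins."""
    merged = {}
    # Explicit rows are visited first, so setdefault keeps their descriptions.
    # Assumption: agent-bound tools outrank generated toolkit descriptions.
    for name, description in [*explicit, *agent_bound, *toolkit]:
        merged.setdefault(name, description)
    return list(merged.items())


rows = merge_tool_rows(
    explicit=[("execute_query", "Run approved catalog queries only.")],
    agent_bound=[("search_products", "Search products by query.")],
    toolkit=[("execute_query", "Generated toolkit description.")],
)
```

Here `execute_query` keeps the customer-authored description even though the toolkit also declares it.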
Accepted section types
| Section | Type |
|---|---|
| identity | str (required) |
| communication, operational_rules, safety | str or sequence of strings (rendered as bullet list) |
| tools | iterable of ToolSpec, str, or mapping |
| domain_knowledge, output_format, examples | any JSON-serializable value, Pydantic model, or string |
Structured sections serialize with sorted keys, so dict ordering does not affect the cache key.
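The effect of sorted keys on the cache key can be checked with the standard library alone. This is a sketch, not the harness code: `render_structured` and `cache_key` are hypothetical helpers, and the exact JSON formatting the library uses is an assumption. The page states only that the key is a SHA-256 digest derived from the rendered text.

```python
# Hypothetical sketch: sorted-key JSON plus SHA-256 gives an order-independent key.
import hashlib
import json


def render_structured(value) -> str:
    """Serialize a structured section with sorted keys (formatting is assumed)."""
    return json.dumps(value, sort_keys=True)


def cache_key(text: str) -> str:
    """Hex SHA-256 digest of the rendered text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same content, different dict insertion order.
first = {"metrics": ["revenue", "margin"], "entities": ["product", "segment"]}
second = {"entities": ["product", "segment"], "metrics": ["revenue", "margin"]}

key_first = cache_key(render_structured(first))
key_second = cache_key(render_structured(second))

# List order is content, not key order, so reordering a list changes the key.
reordered = {"entities": ["segment", "product"], "metrics": ["revenue", "margin"]}
key_reordered = cache_key(render_structured(reordered))
```

Sorting applies to mapping keys only; lists keep their order, so a reordered list is treated as different prompt content in this sketch.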
How a prompt reaches the model
Prompt construction is split into two different concerns:
- Python owns the ergonomic prompt authoring API and validates the runtime spec.
- Rust owns runtime assembly, agent orchestration, framework primitives, tool dispatch, and provider calls.
The system prompt text itself is not rebuilt at every layer. It is rendered
once by Python, copied into the native runtime spec, copied into the framework
agent registration, extracted into a ChatProgram, and finally passed to Rig
as the provider preamble:
Python: layered_prompt(...) renders the prompt text
-> define_coordinator / define_planner / define_executor / define_specialist
-> AgentSpec.system_prompt
-> define_runtime(...) -> RuntimeSpec
-> create_runtime(...) serializes RuntimeSpec as camelCase JSON
-> PyO3: _internal.create_runtime parses flowai_runtime::RuntimeSpec
-> flowai-runtime: AgentSpec::to_registration
-> agent-fw-agent: AgentRegistration.system_prompt
-> Runtime::query -> AgentOrchestrator::invoke
-> agent-fw-agent: ChatMessage::system + parse_conversation
-> agent-fw-agent: ChatProgram.system_prompt
-> agent-fw-interpreter: RigAnthropicChatInterpreter::interpret
-> Rig: AgentBuilder.preamble(system_prompt)
-> LLM request

Python: agent definition and runtime spec
Agent constructors live in the harness module flowai_harness.agents. Each
define_* function accepts either a plain str or a LayeredPrompt and
normalizes the prompt with the same helper:
def _prompt_text_and_cache_key(prompt):
    if isinstance(prompt, LayeredPrompt):
        return prompt.text, prompt.cache_key
    if isinstance(prompt, str):
        return prompt, None
    raise TypeError("prompt must be a string or LayeredPrompt")

The normalized text is stored as AgentSpec.system_prompt. At this point the
system prompt is just a field on a frozen Pydantic model. There is no provider,
no tool dispatcher, and no LLM call. One subtle point: prompt_cache_key is
retained on the Python AgentSpec, but it is excluded from the native runtime
wire shape. The Rust runtime receives the system prompt text, not the Python
cache key.
define_runtime(...) collects agents, references, plans, toolkits, approval
policies, storage descriptors, providers, and tool bindings into a pure
RuntimeSpec. When create_runtime(...) is called, Python serializes that
spec to camelCase JSON:
json.dumps(runtime_spec.model_dump(by_alias=True, mode="json"))

The wire JSON contains each agent's systemPrompt field, along with the
role's stateful default. The optional maxTurns field is omitted when
unset:
{
  "agents": [
    {
      "name": "scenario_planner",
      "role": "planner",
      "stateful": true,
      "model": {"id": "claude-sonnet-4-6", "provider": null},
      "systemPrompt": "# Identity\nYou produce typed scenario plans.",
      "routes": [],
      "toolkits": []
    }
  ]
}

create_runtime(...) also passes host tool callback maps, approval predicates,
event hooks, interpreter selection, and data environment config. Those are
runtime dependencies; they do not rewrite systemPrompt.
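The wire-shape rules described above (camelCase names, the Python-only cache key excluded, maxTurns omitted when unset) can be illustrated without Pydantic. The sketch below uses a plain dataclass as a stand-in: `AgentSpecSketch`, `to_camel`, and `to_wire` are hypothetical names, and the real AgentSpec is a frozen Pydantic model with more fields.

```python
# Hypothetical stand-in for the Python AgentSpec wire serialization.
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class AgentSpecSketch:
    name: str
    role: str
    stateful: bool
    system_prompt: str
    prompt_cache_key: Optional[str] = None  # kept in Python only
    max_turns: Optional[int] = None  # optional on the wire


PYTHON_ONLY = {"prompt_cache_key"}
OMIT_WHEN_NONE = {"max_turns"}


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase, e.g. system_prompt -> systemPrompt."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_wire(spec: AgentSpecSketch) -> dict:
    """Build the camelCase wire dict, dropping Python-only and unset fields."""
    wire = {}
    for field in fields(spec):
        value = getattr(spec, field.name)
        if field.name in PYTHON_ONLY:
            continue
        if field.name in OMIT_WHEN_NONE and value is None:
            continue
        wire[to_camel(field.name)] = value
    return wire


spec = AgentSpecSketch(
    name="scenario_planner",
    role="planner",
    stateful=True,
    system_prompt="# Identity\nYou produce typed scenario plans.",
    prompt_cache_key="not-sent-to-rust",
)
wire_json = json.dumps(to_wire(spec))
```

The prompt text crosses the boundary as an ordinary string field; nothing on the Rust side needs the cache key to use it.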
The PyO3 bridge
The private PyO3 module is the harness extension crate (py-flowai-harness's
src/lib.rs). Its create_runtime(...) function parses the JSON into
flowai_runtime::RuntimeSpec, builds RuntimeDeps, attaches Python callback
tools as Rust ToolHandlers, selects an interpreter, and calls
Runtime::new(spec, deps). The PyO3 layer is a boundary adapter: it does not
compose prompt text. Its job is to move the already-rendered systemPrompt
from Python JSON into Rust types, then attach effectful dependencies around
that pure spec.
Rust runtime assembly
The native runtime spec is defined in the crates/flowai-runtime crate root.
Its AgentSpec has:
pub struct AgentSpec {
    pub name: String,
    pub role: AgentRole,
    pub stateful: bool,
    pub model: ModelSpec,
    pub system_prompt: String,
    pub routes: Vec<String>,
    pub toolkits: Vec<String>,
    pub max_turns: Option<u32>,
}

Runtime assembly converts each harness AgentSpec into the generic framework
AgentRegistration:
fn to_registration(&self) -> AgentRegistration {
    AgentRegistration {
        name: self.name.clone(),
        model: ModelId::new(self.model.id.clone()),
        system_prompt: self.system_prompt.clone(),
        role: Some(self.role.to_agent_label()),
        stateful: self.stateful,
    }
}

This is where the harness stops owning prompt semantics. From here on, the generic framework sees an agent name, model, optional role label, statefulness flag, and system prompt string.
Request-time orchestration
Runtime::query(...) and Runtime::run_specialist(...) are implemented in
crates/flowai-runtime's runtime::query module. For a normal query, the
runtime finds the registered coordinator, builds a per-request
AgentOrchestrator, and calls
orchestrator.invoke(SubAgentRequest::new(entry_agent, prompt)). The
per-request orchestrator wires request-scoped dependencies — a
ChannelEventSink for streaming events back to Python, per-agent dispatchers
for built-in toolkits and host tools, approval, cancellation, tracing, KV,
catalog, target database, and sub-agent invocation extensions, and per-agent
interpreters selected from the runtime provider config. None of that changes
the system prompt; it determines the execution environment in which the prompt
will run.
AgentOrchestrator::invoke(...) lives in crates/agent-fw-agent's
orchestrator module. For each agent call, it builds framework conversation
values. Stateful agents also load their persisted non-system message history
before the current prompt (error mapping elided here):
let mut messages = Vec::new();
messages.push(ChatMessage::system(&registration.system_prompt));
if registration.stateful {
    let history = self.memory.load(&agent_tenant, &registration.name).await?;
    messages.extend(history);
}
messages.push(ChatMessage::user(&request.prompt));
let conversation = parse_conversation(messages)?;
let program = ChatProgram::new(
    conversation,
    registration.model.clone(),
    agent_tenant.clone(),
);

The key framework types are in crates/agent-fw-agent's conversation
module:
- SystemPrompt(String): a thin typed wrapper around the system prompt text.
- Conversation: validated messages plus the current user prompt.
- ChatProgram: a pure value containing Conversation, SystemPrompt, ModelId, and TenantContext.
parse_conversation(...) extracts the first system message into
Conversation.system. ChatProgram::new(...) then copies that value into
ChatProgram.system_prompt. System messages are never included in
Conversation.history(), and stored memory never contains system messages —
the system prompt always comes from the agent registration. This gives the
framework a clean split:
- conversation().prompt() is the current user task.
- conversation().history() is non-system conversational history.
- system_prompt() is the agent's behavioral instruction.
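The three-way split can be modelled in a few lines of Python. This is an illustrative sketch only: the real parse_conversation is Rust code in agent-fw-agent, and `ConversationSketch` and `parse_conversation_sketch` are hypothetical names. How the framework treats a missing system message or a trailing non-user message is not described here, so the error cases below are assumptions.

```python
# Hypothetical Python model of the system / history / prompt split.
from dataclasses import dataclass
from typing import List, Tuple

Message = Tuple[str, str]  # (role, text)


@dataclass(frozen=True)
class ConversationSketch:
    system: str
    history: Tuple[Message, ...]
    prompt: str


def parse_conversation_sketch(messages: List[Message]) -> ConversationSketch:
    """Pull out the first system message; the last user message is the prompt."""
    system = next((text for role, text in messages if role == "system"), None)
    if system is None:
        # Assumption: a conversation without a system message is rejected.
        raise ValueError("a system message is required")
    rest = [message for message in messages if message[0] != "system"]
    if not rest or rest[-1][0] != "user":
        # Assumption: the final message must be the current user prompt.
        raise ValueError("the last message must be the current user prompt")
    *history, (_, prompt) = rest
    return ConversationSketch(system=system, history=tuple(history), prompt=prompt)


conversation = parse_conversation_sketch(
    [
        ("system", "You produce typed scenario plans."),
        ("user", "Earlier question"),
        ("assistant", "Earlier answer"),
        ("user", "Current task"),
    ]
)
```

History never carries a system entry in this model, which mirrors the rule that the system prompt always comes from the agent registration rather than from stored memory.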
Interpreter to LLM
The framework interpreter trait is in crates/agent-fw-agent's interpreter
module:
pub trait ChatInterpreter: Send + Sync {
    fn interpret(
        &self,
        program: ChatProgram,
        cancel: CancellationToken,
    ) -> Pin<Box<dyn Stream<Item = StreamPart> + Send>>;
}

The shipped Anthropic implementation is in crates/agent-fw-interpreter's
rig_chat module. It extracts the pieces from ChatProgram:
let prompt = program.conversation().prompt().as_str().to_string();
let history = conversation_to_rig_history(program.conversation());
let system_prompt = program.system_prompt().as_str().to_string();
let model = program.model().as_str().to_string();

Then the provider-specific runner gives Rig the system prompt as the agent preamble:
let mut builder = AgentBuilder::new(completion_model)
    .preamble(&system_prompt)
    .default_max_turns(max_turns);
let agent = builder.tools(dispatcher_rig_tools(dispatcher)).build();
agent.stream_chat(&prompt, history).with_hook(hook).await

For Anthropic, OpenAI-compatible, and Bedrock Rig paths, the same conceptual mapping applies:
- System prompt -> Rig preamble(...).
- Current user prompt -> stream_chat(&prompt, history).
- Prior non-system messages -> Rig chat history.
- Tools -> Rig tool definitions from the dispatcher, not string concatenation.
The provider client is the first place an LLM request can happen. Everything
before ChatInterpreter::interpret(...) is spec construction, runtime
assembly, or pure framework value construction. The framework deliberately
keeps this boring: the system prompt is immutable configuration for an agent
invocation. The text is rendered once in Python, then copied unchanged from
RuntimeSpec to AgentRegistration to ChatProgram until the interpreter
hands it to the provider as the preamble. Prompt text is data until the
interpreter consumes the ChatProgram; all effects sit at the interpreter and
tool-dispatch layers.
Tool descriptions versus executable tools
There are two tool surfaces that are easy to confuse:
| Surface | Where it appears | Purpose |
|---|---|---|
| Prompt tool table | layered_prompt(tools=[...]) | Human-readable instructions inside the system prompt. |
| Runtime tool binding | define_*agent*(..., tools=[...]), define_runtime(...), toolkits | Executable tool schema and handler registration. |
Including a tool in the prompt table does not register it with the runtime. Registering a tool with the runtime does not automatically insert a tool table into your prompt. In typical agents you do both: register the tool for execution, and include a concise prompt description when the agent needs policy or domain guidance about when to use it.
See also
- layered_prompt reference
- Debugging System Prompts: find the stage where a prompt went missing or stale.
- Tenant: runtime identity is separate from prompt content.
- Tools: the tools argument auto-derives the Markdown table.
Tools
Tools are async Python handlers wrapped in a ToolSpec. The @define_tool(...) decorator builds the spec and binds the handler in one step.
Runtime
define_runtime(...) produces a validated RuntimeSpec; create_runtime(...) hands the spec to the Rust runtime (flowai-runtime) and returns a native Runtime handle.
