核心 agent 循环

本章讲整个 SDK 的心脏:Runner.run() 背后那个 while 循环。读懂「一个回合做什么」和「五种下一步如何决策」,你就掌握了这个库的本质。

1. 它要解决的小问题

LLM 一次调用只会返回一坨输出——可能是一段文字,也可能是「我想调 get_weather('北京')」这样的工具调用请求。但 LLM 自己不会真的去执行工具,也不会自动把结果拿回来接着想。

所以需要一个外层循环替它跑腿:调模型 → 看它想干嘛 → 真的去执行 → 把执行结果塞回对话 → 再调模型,反复直到它给出最终答案。这个循环就是 agent 框架的全部价值所在。

2. 思路 / 直觉:把「下一步」收敛成四个类型

SDK 最关键的设计决策:每个回合结束后,把「接下来该干嘛」归类成恰好四种之一。这四个类型定义在 run_steps.py:154-174:

下一步类型	含义	循环怎么响应
`NextStepFinalOutput`	模型给出了最终答案	跑输出护栏,返回 `RunResult`,结束
`NextStepHandoff`	模型要把任务交给另一个 agent	切换 `current_agent`,继续循环
`NextStepRunAgain`	跑完了工具,需要把结果喂回模型	直接继续循环
`NextStepInterruption`	有工具调用在等人类审批	打包成「待恢复」状态,提前返回

把无穷无尽的可能性压缩成这四种,循环主体就变得极其清爽:拿到 next_step,match 一下,要么 continue 要么 return。

3. 一个回合做什么(run_single_turn)

一个回合分三步,全在 run_single_turn(run_loop.py:1708)里:

  run_single_turn:
准备  → 跑 on_agent_start 钩子;拼 system_prompt、handoffs 列表、工具列表、输入历史
调模型 → get_new_response(...)  得到 ModelResponse
解析  → get_single_step_result_from_response(...)  →  SingleStepResult(带 next_step)

第 1 步——准备。 注意它用 asyncio.gather 并发拿 system prompt 和动态 prompt 配置:

# run_loop.py:1751 run_single_turn —— 并发取指令与 prompt 配置
system_prompt, prompt_config = await asyncio.gather(
    execution_agent.get_system_prompt(context_wrapper),
    execution_agent.get_prompt(context_wrapper),
)

get_system_prompt(agent.py:938)允许 instructions 是字符串、也可以是 (context, agent) -> str 的回调(动态系统提示)。

第 2 步——调模型。 get_new_response(run_loop.py:1798)做了几件聪明事:先跑 maybe_filter_model_input(允许用户改写发给模型的输入),然后对输入做去重 deduplicate_input_items_preferring_latest(run_loop.py:1826),再 get_model(...) 拿到具体模型实现去调。模型这层被 Model 接口(models/interface.py:37)完全抽象掉——这就是「provider-agnostic」的来源。

第 3 步——解析。 把原始 ModelResponse 交给 process_model_response 分拣,再交给 execute_tools_and_side_effects 执行并算出 next_step。下面两节细讲这俩。

4. 分拣:process_model_response

模型的输出是一个 output 列表,里面混着各种类型的 item:文本消息、函数调用、计算机操作、shell 调用、MCP 审批请求……process_model_response(turn_resolution.py:1551)的工作就是遍历这个列表,按类型分门别类塞进不同的桶,产出一个 ProcessedResponse(run_steps.py:115)。

它先建好几张查找表,再循环分拣:

# turn_resolution.py:1573 process_model_response —— 先建查找表
handoff_map = {handoff.tool_name: handoff for handoff in handoffs}
function_map = build_function_tool_lookup_map(
    [tool for tool in all_tools if isinstance(tool, FunctionTool)]
)
computer_tool = next((t for t in all_tools if isinstance(t, ComputerTool)), None)
# ... 然后 for output in response.output: 按 output_type 分拣进各个 list

ProcessedResponse 里的桶包括:handoffs、functions、computer_actions、shell_calls、apply_patch_calls、mcp_approval_requests、function_tools_not_found 等(run_steps.py:116-130)。它还有个关键方法 has_tools_or_approvals_to_run()(run_steps.py:132)——只要任意一个桶非空,就说明这回合「有本地活要干」。

坑点: 如果模型产出了一个 shell_call 但 agent 根本没配 shell 工具,这里会直接抛 ModelBehaviorError(turn_resolution.py:1635)——框架把「模型瞎调不存在的工具」当成模型行为错误处理。

5. 决策:execute_tools_and_side_effects(精华)

这是整个循环最核心的函数(turn_resolution.py:629)。它先把所有要跑的工具跑掉(_execute_tool_plan,内部并发执行函数工具、计算机操作等),拿到结果后按一个明确的优先级决定 next_step。这个优先级顺序就是 agent 的行为语义,值得逐条看:

优先级 1 —— 有审批中断? 任何工具需要人类审批(见 04 章),立即返回 NextStepInterruption,中止后续判断:

# turn_resolution.py:713 —— 中断优先级最高
if interruptions:
    return SingleStepResult(
        ...,
        next_step=NextStepInterruption(interruptions=interruptions),
        ...,
    )

优先级 2 —— 有交接? 如果模型调了某个 handoff 工具,执行交接逻辑(可能多个 handoff 时只取第一个),返回 NextStepHandoff:

# turn_resolution.py:732 —— 交接
if run_handoffs := processed_response.handoffs:
    return await execute_handoffs_call(public_agent=public_agent, ..., run_handoffs=run_handoffs, ...)

优先级 3 —— 工具结果直接当最终输出? 取决于 agent 的 tool_use_behavior。这条逻辑在 check_for_final_output_from_tools(turn_resolution.py:594):

`tool_use_behavior` 的值	行为
`"run_llm_again"`(默认)	工具结果喂回模型,不当最终输出
`"stop_on_first_tool"`	第一个工具的输出直接当最终结果,不再问模型
`StopAtTools(stop_at_tool_names=[...])`	命中名单里的工具就停,用其输出当结果
一个回调函数	你自己决定是否 final、final 是什么

# turn_resolution.py:605 check_for_final_output_from_tools —— stop_on_first_tool 分支
elif agent.tool_use_behavior == "stop_on_first_tool":
    return ToolsToFinalOutputResult(is_final_output=True, final_output=tool_results[0].output)

优先级 4 —— 模型给了文本/结构化输出且没有别的活要干? 走最终输出。这里有三种子情况(turn_resolution.py:769-835):模型拒答(refusal,抛 ModelRefusalError 或交给 error handler)、有 output_type 则把文本按 schema 校验成结构化对象、纯文本则直接当最终答案。

优先级 5 —— 兜底:再跑一圈。 以上都不满足(典型:跑了工具、tool_use_behavior 是默认值),返回 NextStepRunAgain,把工具结果喂回模型继续:

# turn_resolution.py:837 —— 兜底
return SingleStepResult(
    ...,
    next_step=NextStepRunAgain(),
    ...,
)

6. 外层 while:把回合串起来

回到 AgentRunner.run(run.py:450)。剥掉会话/追踪/sandbox 的外壳,主循环骨架是这样(run.py:767 起):

# 示意,提炼自 run.py:767 起的 while 循环
while True:
    if current_turn == 0:                     # 只在第一回合跑输入护栏(且只跑首个 agent 的)
        run_input_guardrails(...)
    current_turn += 1
    if max_turns is not None and current_turn > max_turns:
        raise MaxTurnsExceeded(...)           # 防止无限循环的硬上限

    turn_result = await run_single_turn(...)  # 跑一个回合

    if isinstance(turn_result.next_step, NextStepInterruption):
        return _finalize_result(build_interruption_result(...))   # 待审批,提前返回
    if isinstance(turn_result.next_step, NextStepFinalOutput):
        run_output_guardrails(...)            # 出最终答案前跑输出护栏
        return _finalize_result(RunResult(...))
    if isinstance(turn_result.next_step, NextStepHandoff):
        current_agent = turn_result.next_step.new_agent   # 换 agent
        continue
    # NextStepRunAgain → 直接 continue

真实代码里这段被会话持久化、server 端对话追踪、中断恢复(resolve_interrupted_turn,run.py:849)等逻辑撑得很长,但逻辑骨架就是上面这几行。两个细节值得记:

输入护栏只在第 0 回合、且只对起始 agent 跑(run.py:770-772)——交接后的新 agent 不会重跑输入护栏。
max_turns(默认见 run_config.DEFAULT_MAX_TURNS)是防止 agent 死循环的硬刹车(run.py:1058),超了抛 MaxTurnsExceeded(除非配了 error handler 兜底)。

7. 流式版本(run_streamed)

Runner.run_streamed(run.py:1647 / run_streamed 工厂在 run.py:365)返回一个 RunResultStreaming,你用 async for event in result.stream_events() 实时拿事件(result.py:696)。事件分三类(stream_events.py:61):

RawResponsesStreamEvent —— 模型的原始 token 流(逐字)。
RunItemStreamEvent —— 高层语义事件(「生成了一条消息」「调了一个工具」「工具返回了」)。
AgentUpdatedStreamEvent —— 发生了交接,当前 agent 变了。

底层循环和非流式共用同一套「下一步」决策,只是通过一个队列把中间产物实时推给消费者(start_streaming,run_loop.py:440)。

8. 巧妙之处

「下一步」类型把控制流变成数据。 不是用一堆嵌套 if 在循环里直接跳转,而是让 run_single_turn 返回一个值(SingleStepResult.next_step),循环只负责 match 这个值。这让「一个回合」可被独立测试、可被流式/非流式复用、可被序列化进 RunState 以支持中断恢复(run_steps.py:177 的 SingleStepResult)。
整个 run 可序列化成 RunState。 RunResult.to_state()(result.py:393)能把当前进度(含待审批的工具)存成 RunState,之后把 RunState 当 input 再传给 Runner.run 即可从断点续跑(run.py:469 的 is_resumed_state 分支)。这是人在环路的基础。
provider 无关。 循环从不直接 import 任何具体模型;只通过 Model 抽象(models/interface.py:37)调 get_response / stream_response。换模型 = 换一个 ModelProvider。

9. 代码地图

主题	文件路径	符号名
while 主循环	`src/agents/run.py`	`AgentRunner.run`
一个回合	`src/agents/run_internal/run_loop.py`	`run_single_turn`、`get_new_response`
响应分拣	`src/agents/run_internal/turn_resolution.py`	`process_model_response`、`ProcessedResponse`
执行+决策	`src/agents/run_internal/turn_resolution.py`	`execute_tools_and_side_effects`、`check_for_final_output_from_tools`
下一步类型	`src/agents/run_internal/run_steps.py`	`NextStepFinalOutput`、`NextStepHandoff`、`NextStepRunAgain`、`NextStepInterruption`、`SingleStepResult`
流式	`src/agents/run_internal/run_loop.py`	`start_streaming`、`run_single_turn_streamed`
流事件类型	`src/agents/stream_events.py`	`RawResponsesStreamEvent`、`RunItemStreamEvent`、`AgentUpdatedStreamEvent`
续跑入口	`src/agents/run.py`	`resolve_interrupted_turn`

1. 它要解决的小问题​

2. 思路 / 直觉:把「下一步」收敛成四个类型​

3. 一个回合做什么(run_single_turn)​

4. 分拣:process_model_response​

5. 决策:execute_tools_and_side_effects(精华)​

6. 外层 while:把回合串起来​

7. 流式版本(run_streamed)​

8. 巧妙之处​

9. 代码地图​