第 5 章 · 模型抽象与推理

这章讲两件事:Agno 怎么用一个基类罩住 30+ 个模型 provider(让上层循环不关心你用的是 OpenAI 还是 Anthropic),以及 它的「推理(reasoning)」如何工作——原生推理模型和默认 CoT 两条路。

5.1 模型抽象:一个基类罩住所有 provider

models/ 下有几十个 provider 目录(openai、anthropic、google、groq、mistral、ollama、bedrock…)。它们都继承同一个 Model 基类(models/base.py)。基类定义了少数几个「provider 必须实现」的抽象方法:

抽象方法	干什么	位置
`invoke`	同步调一次底层 API,返回原始响应	`models/base.py:552`
`ainvoke`	异步版	`:556`
`invoke_stream` / `ainvoke_stream`	流式版	`:560`、`:564`

基类则用这些原语实现了通用部分:response(同步工具循环,:650)、aresponse(:881)、response_stream(:1362)、aresponse_stream(:1641),以及工具执行 run_function_calls(:2311)、结果格式化等。

用图说清「谁实现什么」:

        Model 基类  (models/base.py)
  ┌──────────────────────────────────────────┐
  │ 通用(基类已实现,只写一份):                │
  │   response / aresponse / *_stream  ← 工具循环 │
  │   run_function_calls               ← 执行工具 │
  │   format_function_call_results               │
  ├──────────────────────────────────────────┤
  │ 抽象(每个 provider 子类各实现):            │
  │   invoke / ainvoke                 ← 调底层API │
  │   _process_model_response          ← 解析响应  │
  │   _format_tools                    ← 工具格式化 │
  └──────────────────────────────────────────┘
        △                  △                 △
   OpenAIChat         Claude            Gemini  …(30+)

为什么这么分: 各家 API 的请求/响应 JSON、工具描述格式都不同,这些差异被压进子类的 invoke / _process_model_response / _format_tools;而「反复调工具直到收工」的控制流(第 1 章那个 while True)只写一遍在基类里。新增一个 provider,基本只要实现「怎么调它的 API、怎么解析它的响应」。

5.2 Fallback:主模型挂了换备用

Agent 可配 fallback_models(agent/agent.py:74)和 fallback_config(:76)。第 1 章里 _run 调的不是裸 model.response,而是 call_model_with_fallback(agent.model, agent.fallback_config, ...)(_run.py:531)——主模型抛错时按配置切到备用模型(逻辑在 models/fallback.py)。fallback_config 还支持「按错误类型路由」(agent.py:76 注释:advanced error-specific routing)。

5.3 推理:两条路

Agno 的「推理」由 ReasoningManager(reasoning/manager.py)统管。第 1 章提过,_run 在调主模型前会先 handle_reasoning(agent/_response.py:74),它内部调 reason(...)。推理分两条路:

路 A:原生推理模型(native)

有些模型自带「思考」能力(DeepSeek-R1、Claude with thinking、OpenAI o 系列、Gemini thinking 等)。ReasoningManager._detect_model_type(reasoning/manager.py:123)逐个判断模型属于哪种,再走对应的提取函数:

# reasoning/manager.py:209 —— 按模型类型取原生推理(节选)
if model_type == "deepseek":
    reasoning_message = get_deepseek_reasoning(reasoning_agent, messages, ...)
elif model_type == "anthropic":
    reasoning_message = get_anthropic_reasoning(reasoning_agent, messages, ...)
elif model_type == "openai":
    reasoning_message = get_openai_reasoning(reasoning_agent, messages, ...)
# ... groq / ollama / ai_foundry / gemini / vertexai

每种 provider 的「思考内容」放在响应的不同字段(reasoning_content / thinking blocks…),所以每家有自己的提取器(reasoning/anthropic.py、reasoning/deepseek.py 等)。原生路返回的 ReasoningResult 把思考包成一个 ReasoningStep(reasoning/manager.py:267-271)。

路 B:默认 CoT 推理 agent(default)

模型若不是原生推理模型,Agno 用一个**独立的「推理 agent」**做显式的多步思考(chain-of-thought)。_get_default_reasoning_agent(reasoning/manager.py:167)创建它,带 min_steps / max_steps 约束(:176-177)。

这个推理 agent 的输出是结构化的 ReasoningSteps——每步一个 ReasoningStep,带一个 NextAction(reasoning/step.py 的 NextAction、ReasoningStep、ReasoningSteps,在 manager.py:30 import)。(inferred) 默认路的循环是「反复让推理 agent 产出下一步,直到它给出 NextAction=final 或到 max_steps」——这从 min_steps/max_steps 参数和 NextAction 类型推断,默认 agent 的循环体在 reasoning/default.py,本次未逐行读。

两条路的选择

  开了 reasoning?
     │
     ▼
  is_native_reasoning_model(model)?    (manager.py:187)
     ├─ 是 → 路 A:直接用模型自带的思考(各 provider 提取器)
     └─ 否 → 路 B:跑一个独立 CoT 推理 agent,产出 ReasoningStep 序列
              │
              ▼
     把推理结果并入 run_messages,再进第 1 章的主模型调用

两条路的产物都汇成 ReasoningStep,最终通过 format_reasoning_step_content(agent/_response.py:143)注入到给主模型的消息里,让最终回答「带着想清楚的思路」。

5.4 reasoning vs reasoning_model 两个开关

reasoning: bool(agent/agent.py:199)—— 开启推理,用 agent 自己的 model 走默认 CoT。
reasoning_model: Optional[Model](:200)—— 指定专门的推理模型(可以和回答模型不同,比如用 R1 思考、用 GPT-4o 回答)。handle_reasoning 的触发条件就是这两者任一为真(_response.py:77:if agent.reasoning or agent.reasoning_model is not None)。

5.5 巧妙之处

provider 差异收敛到 4 个抽象方法。 上层完全不知道你用哪家(models/base.py:552 起的抽象 + 基类通用循环)。
推理与回答可用不同模型。 reasoning_model 独立于 model(agent.py:200),能「贵模型思考、便宜模型回答」或反之。
原生 vs 显式 CoT 统一成 ReasoningStep。 不管思考来自模型内部还是外部 agent,下游都按同一种结构处理(reasoning/step.py)。
fallback 是包在 response 外的薄层。 call_model_with_fallback(_run.py:531)不改动 Model 循环,只在抛错时换模型重试。

5.6 边界

原生推理只对已识别的 provider 生效;is_native_reasoning_model 在 model is None 时返回 False、否则返回 _detect_model_type(model) is not None(manager.py:190-191),即 _detect_model_type 返回 None(manager.py:150)就当普通模型走默认路。
默认 CoT 推理会额外消耗 token 和时延(多跑一个 agent)——max_steps 是它的安全阀。

5.7 代码地图

主题	文件	符号
模型基类	`libs/agno/agno/models/base.py`	`Model`、`invoke`、`response`、`run_function_calls`
provider 适配	`libs/agno/agno/models/openai/chat.py` 等	`OpenAIChat`(及各 provider)
fallback	`libs/agno/agno/models/fallback.py`	`FallbackConfig`、`call_model_with_fallback`
推理总管	`libs/agno/agno/reasoning/manager.py`	`ReasoningManager`、`_detect_model_type`、`is_native_reasoning_model`
推理触发	`libs/agno/agno/agent/_response.py`	`handle_reasoning`、`format_reasoning_step_content`
推理数据	`libs/agno/agno/reasoning/step.py`	`ReasoningStep`、`ReasoningSteps`、`NextAction`
各 provider 提取器	`libs/agno/agno/reasoning/`	`anthropic.py`、`deepseek.py`、`openai.py`…

5.1 模型抽象:一个基类罩住所有 provider​

5.2 Fallback:主模型挂了换备用​

5.3 推理:两条路​

路 A:原生推理模型(native)​

路 B:默认 CoT 推理 agent(default)​

两条路的选择​

5.4 reasoning vs reasoning_model 两个开关​

5.5 巧妙之处​

5.6 边界​

5.7 代码地图​