ai-robot-core/spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/design.md


---
feature_id: "AISVC"
iteration_id: "v0.8.0-intent-hybrid-routing"
title: "Intent Recognition Hybrid Routing Optimization - Technical Design"
status: "draft"
version: "0.8.0"
created_at: "2026-03-08"
inputs:
- "spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/requirements.md"
- "spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/scope.md"
---
# Intent Recognition Hybrid Routing Optimization - Design v0.8.0
## 1. Design Goals and Constraints
### 1.1 Design Goals
- Upgrade intent recognition from single-path rule matching to a three-way hybrid route: rules + semantics + LLM
- Improve intent recognition recall and precision
- Provide confidence scores and route tracing logs
- Minimal intrusion: insert hybrid routing only at Step 3; leave the main pipeline unchanged
### 1.2 Hard Constraints
- The existing rule engine remains usable as one input to the hybrid router
- The external response semantics of `/ai/chat` stay unchanged
- All new logic must be isolated by tenantId
- Keep the `IntentRouter.match()` method backward compatible
---
## 2. Architecture Design
### 2.1 Minimal-Intrusion Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Orchestrator 12-Step Pipeline                                               │
│                                                                             │
│ Step 1: InputScanner → Step 2: FlowEngine → Step 3: IntentRouter [modified] │
│                                                  │                          │
│                                                  ▼                          │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ IntentRouter (Hybrid Routing)                                           │ │
│ │                                                                         │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Parallel Matching Layer                                             │ │ │
│ │ │                                                                     │ │ │
│ │ │ ┌───────────────┐   ┌───────────────┐   ┌───────────────┐           │ │ │
│ │ │ │ RuleMatcher   │   │SemanticMatcher│   │ LlmJudge      │           │ │ │
│ │ │ │(existing+score)│  │ (new)         │   │ (conditional) │           │ │ │
│ │ │ │               │   │               │   │               │           │ │ │
│ │ │ │ keywords      │   │ embedding     │   │ LLM call      │           │ │ │
│ │ │ │ regex         │   │ similarity    │   │ arbitration   │           │ │ │
│ │ │ │ score: 0|1    │   │ score: 0~1    │   │ score: 0~1    │           │ │ │
│ │ │ └───────┬───────┘   └───────┬───────┘   └───────┬───────┘           │ │ │
│ │ │         │                   │                   │                   │ │ │
│ │ │         └───────────────────┼───────────────────┘                   │ │ │
│ │ │                             ▼                                       │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │                               │                                         │ │
│ │                               ▼                                         │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ FusionPolicy (new)                                                  │ │ │
│ │ │                                                                     │ │ │
│ │ │ Input:   rule_result, semantic_result, llm_result                   │ │ │
│ │ │ Process: weighted fusion + conflict detection + threshold decision  │ │ │
│ │ │ Output:  final_intent, final_confidence, decision_reason, trace     │ │ │
│ │ │                                                                     │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │                                                                         │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
│                               │                                             │
│                               ▼                                             │
│                 response_type routing (unchanged)                           │
│                 fixed / rag / flow / transfer                               │
└─────────────────────────────────────────────────────────────────────────────┘
```
### 2.2 Insertion Points
| Insertion point | Location | Change |
|--------|------|----------|
| Step 3 entry | orchestrator.py:500 | Call `IntentRouter.match_hybrid()` instead of `match()` |
| IntentRule entity | entities.py:420-463 | Add `intent_vector` and `semantic_examples` fields |
| IntentRouter class | intent/router.py | Add `match_hybrid()` method; keep `match()` backward compatible |
| New module | intent/ | Add `semantic_matcher.py`, `llm_judge.py`, `fusion_policy.py`, `models.py` |
---
## 3. Core Interface Design
### 3.1 Data Models
```python
# Location: app/services/intent/models.py
from dataclasses import dataclass, field
from typing import Any
import uuid


@dataclass
class RuleMatchResult:
    """Rule matching result."""
    rule_id: uuid.UUID | None
    rule: "IntentRule | None"
    match_type: str | None  # "keyword" | "regex" | None
    matched_text: str | None
    score: float  # 1.0 or 0.0
    duration_ms: int


@dataclass
class SemanticCandidate:
    """Semantic match candidate."""
    rule: "IntentRule"
    score: float  # similarity in 0.0 ~ 1.0


@dataclass
class SemanticMatchResult:
    """Semantic matching result."""
    candidates: list[SemanticCandidate]  # top-N candidates
    top_score: float
    duration_ms: int
    skipped: bool  # whether matching was skipped (e.g. no semantic config)
    skip_reason: str | None


@dataclass
class LlmJudgeInput:
    """LLM arbitration input."""
    message: str
    candidates: list[dict]  # candidate intents
    conflict_type: str  # "rule_semantic_conflict" | "gray_zone" | "multi_intent"


@dataclass
class LlmJudgeResult:
    """LLM arbitration result."""
    intent_id: str | None
    intent_name: str | None
    score: float  # 0.0 ~ 1.0
    reasoning: str | None  # the LLM's reasoning
    duration_ms: int
    tokens_used: int
    triggered: bool

    @classmethod
    def empty(cls) -> "LlmJudgeResult":
        """Placeholder when the judge was not triggered
        (used by FusionPolicy.DECISION_PRIORITY)."""
        return cls(
            intent_id=None,
            intent_name=None,
            score=0.0,
            reasoning=None,
            duration_ms=0,
            tokens_used=0,
            triggered=False,
        )


@dataclass
class FusionConfig:
    """Fusion configuration."""
    w_rule: float = 0.5
    w_semantic: float = 0.3
    w_llm: float = 0.2
    semantic_threshold: float = 0.7
    conflict_threshold: float = 0.2
    gray_zone_threshold: float = 0.6
    min_trigger_threshold: float = 0.3
    clarify_threshold: float = 0.4
    multi_intent_threshold: float = 0.15
    llm_judge_enabled: bool = True
    semantic_matcher_enabled: bool = True
    semantic_matcher_timeout_ms: int = 100  # see Section 8.2
    llm_judge_timeout_ms: int = 2000        # see Section 8.2


@dataclass
class RouteTrace:
    """Route tracing log."""
    rule_match: dict = field(default_factory=dict)
    semantic_match: dict = field(default_factory=dict)
    llm_judge: dict = field(default_factory=dict)
    fusion: dict = field(default_factory=dict)


@dataclass
class FusionResult:
    """Fusion decision result."""
    final_intent: "IntentRule | None"
    final_confidence: float
    decision_reason: str
    need_clarify: bool
    clarify_candidates: list["IntentRule"] | None
    trace: RouteTrace
```
### 3.2 RuleMatcher (refactored from the existing IntentRouter)
```python
# Location: app/services/intent/router.py
import time


class RuleMatcher:
    """Rule matcher (based on the existing IntentRouter)."""

    def match(self, message: str, rules: list[IntentRule]) -> RuleMatchResult:
        """
        Keyword + regex matching.

        Algorithm:
        1. Iterate over rules in descending priority order
        2. For each rule, try keyword matching first
        3. If no keyword matches, try regex pattern matching
        4. Return the first match (highest priority)

        Args:
            message: user message
            rules: rule list (already sorted by priority, descending)

        Returns:
            RuleMatchResult: matching result
        """
        start_time = time.time()
        message_lower = message.lower()
        for rule in rules:
            if not rule.is_enabled:
                continue
            result = self._match_keywords(message, message_lower, rule)
            if result:
                duration_ms = int((time.time() - start_time) * 1000)
                return RuleMatchResult(
                    rule_id=rule.id,
                    rule=rule,
                    match_type="keyword",
                    matched_text=result.matched,
                    score=1.0,
                    duration_ms=duration_ms,
                )
            result = self._match_patterns(message, rule)
            if result:
                duration_ms = int((time.time() - start_time) * 1000)
                return RuleMatchResult(
                    rule_id=rule.id,
                    rule=rule,
                    match_type="regex",
                    matched_text=result.matched,
                    score=1.0,
                    duration_ms=duration_ms,
                )
        duration_ms = int((time.time() - start_time) * 1000)
        return RuleMatchResult(
            rule_id=None,
            rule=None,
            match_type=None,
            matched_text=None,
            score=0.0,
            duration_ms=duration_ms,
        )

    def _match_keywords(self, message: str, message_lower: str, rule: IntentRule) -> IntentMatchResult | None:
        """Keyword matching (existing logic retained)."""
        ...

    def _match_patterns(self, message: str, rule: IntentRule) -> IntentMatchResult | None:
        """Regex matching (existing logic retained)."""
        ...
```
### 3.3 SemanticMatcher (new)
```python
# Location: app/services/intent/semantic_matcher.py
import asyncio
import time
from typing import Any

import numpy as np


class SemanticMatcher:
    """Semantic matcher."""

    def __init__(
        self,
        embedding_provider: EmbeddingProvider,
        config: FusionConfig
    ):
        self._embedding_provider = embedding_provider
        self._config = config

    async def match(
        self,
        message: str,
        rules: list[IntentRule],
        tenant_id: str,
        top_k: int = 3
    ) -> SemanticMatchResult:
        """
        Vector-based semantic matching.

        Matching modes:
        - Mode A: use the rule's precomputed intent_vector to compute similarity directly
        - Mode B: embed the rule's semantic_examples on the fly and take the max similarity

        Args:
            message: user message
            rules: rule list
            tenant_id: tenant ID
            top_k: number of candidates to return

        Returns:
            SemanticMatchResult: matching result
        """
        start_time = time.time()
        if not self._config.semantic_matcher_enabled:
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=0,
                skipped=True,
                skip_reason="disabled"
            )
        rules_with_semantic = [r for r in rules if self._has_semantic_config(r)]
        if not rules_with_semantic:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=duration_ms,
                skipped=True,
                skip_reason="no_semantic_config"
            )
        try:
            message_vector = await asyncio.wait_for(
                self._embedding_provider.embed(message),
                timeout=self._config.semantic_matcher_timeout_ms / 1000
            )
        except asyncio.TimeoutError:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=duration_ms,
                skipped=True,
                skip_reason="embedding_timeout"
            )
        except Exception as e:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=duration_ms,
                skipped=True,
                skip_reason=f"embedding_error: {str(e)}"
            )
        candidates = []
        for rule in rules_with_semantic:
            score = await self._calculate_similarity(message_vector, rule)
            if score > 0:
                candidates.append(SemanticCandidate(rule=rule, score=score))
        candidates.sort(key=lambda x: x.score, reverse=True)
        candidates = candidates[:top_k]
        duration_ms = int((time.time() - start_time) * 1000)
        return SemanticMatchResult(
            candidates=candidates,
            top_score=candidates[0].score if candidates else 0.0,
            duration_ms=duration_ms,
            skipped=False,
            skip_reason=None
        )

    def _has_semantic_config(self, rule: IntentRule) -> bool:
        """Check whether the rule has any semantic configuration."""
        return bool(rule.intent_vector) or bool(rule.semantic_examples)

    async def _calculate_similarity(self, message_vector: list[float], rule: IntentRule) -> float:
        """Compute the similarity between the message vector and a rule."""
        if rule.intent_vector:
            return self._cosine_similarity(message_vector, rule.intent_vector)
        elif rule.semantic_examples:
            example_vectors = await self._embedding_provider.embed_batch(rule.semantic_examples)
            similarities = [
                self._cosine_similarity(message_vector, v)
                for v in example_vectors
            ]
            return max(similarities) if similarities else 0.0
        return 0.0

    def _cosine_similarity(self, v1: list[float], v2: list[float]) -> float:
        """Cosine similarity."""
        v1_arr = np.array(v1)
        v2_arr = np.array(v2)
        return float(np.dot(v1_arr, v2_arr) / (np.linalg.norm(v1_arr) * np.linalg.norm(v2_arr)))
```
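To make Mode B concrete: a rule is scored by the maximum cosine similarity over its example embeddings, so a single close example is enough to surface the rule as a candidate. A toy 2-dimensional illustration (standalone; vectors here are made up, not real embeddings):

```python
import numpy as np


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    a_arr, b_arr = np.array(a), np.array(b)
    return float(np.dot(a_arr, b_arr) / (np.linalg.norm(a_arr) * np.linalg.norm(b_arr)))


def mode_b_score(message_vec: list[float], example_vecs: list[list[float]]) -> float:
    """Mode B: max similarity over a rule's example embeddings."""
    sims = [cosine(message_vec, v) for v in example_vecs]
    return max(sims) if sims else 0.0
```

Taking the max rather than the mean means one orthogonal example does not drag down an otherwise strong match.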
### 3.4 LlmJudge (new)
```python
# Location: app/services/intent/llm_judge.py
import asyncio
import json
import time


class LlmJudge:
    """LLM arbitrator."""

    JUDGE_PROMPT = """You are an intent-recognition arbitrator. Given the user message and the candidate intents, decide which intent matches best.

User message: {message}

Candidate intents:
{candidates}

Return JSON in this format:
{{
    "intent_id": "ID of the best-matching intent",
    "intent_name": "intent name",
    "confidence": confidence between 0.0 and 1.0,
    "reasoning": "rationale for the decision"
}}
"""

    def __init__(
        self,
        llm_client: LLMClient,
        config: FusionConfig
    ):
        self._llm_client = llm_client
        self._config = config

    def should_trigger(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        config: FusionConfig
    ) -> tuple[bool, str]:
        """
        Decide whether to trigger the LLM judge.

        Trigger conditions:
        1. Conflict: RuleMatcher and SemanticMatcher hit different intents
        2. Gray zone: the highest confidence falls inside the gray-zone range
        3. Multi-intent: several candidate intents have near-identical confidence

        Args:
            rule_result: rule matching result
            semantic_result: semantic matching result
            config: fusion configuration

        Returns:
            (whether to trigger, trigger reason)
        """
        if not config.llm_judge_enabled:
            return False, "disabled"
        rule_score = rule_result.score
        semantic_score = semantic_result.top_score
        if rule_score > 0 and semantic_score > 0:
            if rule_result.rule_id != semantic_result.candidates[0].rule.id:
                if abs(rule_score - semantic_score) < config.conflict_threshold:
                    return True, "rule_semantic_conflict"
        max_score = max(rule_score, semantic_score)
        if config.min_trigger_threshold < max_score < config.gray_zone_threshold:
            return True, "gray_zone"
        if len(semantic_result.candidates) >= 2:
            top1_score = semantic_result.candidates[0].score
            top2_score = semantic_result.candidates[1].score
            if abs(top1_score - top2_score) < config.multi_intent_threshold:
                return True, "multi_intent"
        return False, ""

    async def judge(
        self,
        judge_input: LlmJudgeInput,
        tenant_id: str
    ) -> LlmJudgeResult:
        """
        Run LLM arbitration.

        Args:
            judge_input: arbitration input
            tenant_id: tenant ID

        Returns:
            LlmJudgeResult: arbitration result
        """
        start_time = time.time()
        candidates_text = "\n".join([
            f"- ID: {c['id']}, name: {c['name']}, description: {c.get('description', 'N/A')}"
            for c in judge_input.candidates
        ])
        prompt = self.JUDGE_PROMPT.format(
            message=judge_input.message,
            candidates=candidates_text
        )
        try:
            response = await asyncio.wait_for(
                self._llm_client.generate(
                    messages=[{"role": "user", "content": prompt}],
                    max_tokens=200,
                    temperature=0
                ),
                timeout=self._config.llm_judge_timeout_ms / 1000
            )
            result = self._parse_response(response.content)
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=result.get("intent_id"),
                intent_name=result.get("intent_name"),
                score=result.get("confidence", 0.5),
                reasoning=result.get("reasoning"),
                duration_ms=duration_ms,
                tokens_used=response.total_tokens,
                triggered=True
            )
        except asyncio.TimeoutError:
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=None,
                intent_name=None,
                score=0.0,
                reasoning="LLM timeout",
                duration_ms=duration_ms,
                tokens_used=0,
                triggered=True
            )
        except Exception as e:
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=None,
                intent_name=None,
                score=0.0,
                reasoning=f"LLM error: {str(e)}",
                duration_ms=duration_ms,
                tokens_used=0,
                triggered=True
            )

    def _parse_response(self, content: str) -> dict:
        """Parse the LLM response as JSON."""
        try:
            return json.loads(content)
        except json.JSONDecodeError:
            return {}
```
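Even at temperature 0, LLMs often wrap JSON in Markdown code fences or surround it with prose, in which case the strict `json.loads` above returns `{}`. A hedged sketch of a more tolerant parser (an optional hardening, not part of the design above) that strips fences and extracts the first brace-delimited block before parsing:

```python
import json
import re


def parse_llm_json(content: str) -> dict:
    """Parse LLM output as JSON, tolerating ```json ... ``` fences
    and surrounding prose; return {} when no valid JSON is found."""
    # Strip Markdown code fences if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", content, re.DOTALL)
    if fenced:
        content = fenced.group(1)
    # Fall back to the first brace-delimited span.
    match = re.search(r"\{.*\}", content, re.DOTALL)
    if match:
        content = match.group(0)
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return {}
```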
### 3.5 FusionPolicy (new)
```python
# Location: app/services/intent/fusion_policy.py
class FusionPolicy:
    """Fusion decision policy."""

    # Evaluated in order; the first predicate that holds wins.
    DECISION_PRIORITY = [
        ("rule_high_confidence", lambda r, s, l: r.score == 1.0 and r.rule is not None),
        ("llm_judge", lambda r, s, l: l.triggered and l.intent_id is not None),
        ("semantic_override", lambda r, s, l: r.score == 0 and s.top_score > 0.7),
        ("rule_semantic_agree", lambda r, s, l: bool(s.candidates) and r.score > 0 and s.top_score > 0.5 and r.rule_id == s.candidates[0].rule.id),
        ("semantic_fallback", lambda r, s, l: s.top_score > 0.5),
        ("rule_fallback", lambda r, s, l: r.score > 0),
        ("no_match", lambda r, s, l: True),
    ]

    def __init__(self, config: FusionConfig):
        self._config = config

    def fuse(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        llm_result: LlmJudgeResult | None
    ) -> FusionResult:
        """
        Fusion decision.

        Args:
            rule_result: rule matching result
            semantic_result: semantic matching result
            llm_result: LLM arbitration result (may be None)

        Returns:
            FusionResult: fusion result
        """
        trace = RouteTrace(
            rule_match={
                "rule_id": str(rule_result.rule_id) if rule_result.rule_id else None,
                "match_type": rule_result.match_type,
                "matched_text": rule_result.matched_text,
                "score": rule_result.score,
                "duration_ms": rule_result.duration_ms
            },
            semantic_match={
                "top_candidates": [
                    {"rule_id": str(c.rule.id), "name": c.rule.name, "score": c.score}
                    for c in semantic_result.candidates
                ],
                "top_score": semantic_result.top_score,
                "duration_ms": semantic_result.duration_ms,
                "skipped": semantic_result.skipped,
                "skip_reason": semantic_result.skip_reason
            },
            llm_judge={
                "triggered": llm_result.triggered if llm_result else False,
                "intent_id": llm_result.intent_id if llm_result else None,
                "score": llm_result.score if llm_result else 0.0,
                "duration_ms": llm_result.duration_ms if llm_result else 0,
                "tokens_used": llm_result.tokens_used if llm_result else 0
            },
            fusion={}
        )
        final_intent = None
        final_confidence = 0.0
        decision_reason = "no_match"
        for reason, condition in self.DECISION_PRIORITY:
            if condition(rule_result, semantic_result, llm_result or LlmJudgeResult.empty()):
                decision_reason = reason
                break
        if decision_reason == "rule_high_confidence":
            final_intent = rule_result.rule
            final_confidence = 1.0
        elif decision_reason == "llm_judge" and llm_result:
            final_intent = self._find_rule_by_id(llm_result.intent_id, rule_result, semantic_result)
            final_confidence = llm_result.score
        elif decision_reason == "semantic_override":
            final_intent = semantic_result.candidates[0].rule
            final_confidence = semantic_result.top_score
        elif decision_reason == "rule_semantic_agree":
            final_intent = rule_result.rule
            final_confidence = self._calculate_weighted_confidence(rule_result, semantic_result, llm_result)
        elif decision_reason == "semantic_fallback":
            final_intent = semantic_result.candidates[0].rule
            final_confidence = semantic_result.top_score
        elif decision_reason == "rule_fallback":
            final_intent = rule_result.rule
            final_confidence = rule_result.score
        need_clarify = final_confidence < self._config.clarify_threshold
        clarify_candidates = None
        if need_clarify and len(semantic_result.candidates) > 1:
            clarify_candidates = [c.rule for c in semantic_result.candidates[:3]]
        trace.fusion = {
            "weights": {
                "w_rule": self._config.w_rule,
                "w_semantic": self._config.w_semantic,
                "w_llm": self._config.w_llm
            },
            "final_confidence": final_confidence,
            "decision_reason": decision_reason
        }
        return FusionResult(
            final_intent=final_intent,
            final_confidence=final_confidence,
            decision_reason=decision_reason,
            need_clarify=need_clarify,
            clarify_candidates=clarify_candidates,
            trace=trace
        )

    def _calculate_weighted_confidence(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        llm_result: LlmJudgeResult | None
    ) -> float:
        """Weighted confidence (see Section 4.1)."""
        rule_score = rule_result.score
        semantic_score = semantic_result.top_score if not semantic_result.skipped else 0.0
        llm_score = llm_result.score if llm_result and llm_result.triggered else 0.0
        total_weight = self._config.w_rule + self._config.w_semantic
        if llm_result and llm_result.triggered:
            total_weight += self._config.w_llm
        confidence = (
            self._config.w_rule * rule_score +
            self._config.w_semantic * semantic_score +
            self._config.w_llm * llm_score
        ) / total_weight
        return min(1.0, max(0.0, confidence))

    def _find_rule_by_id(
        self,
        intent_id: str | None,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult
    ) -> IntentRule | None:
        """Look up a rule by ID among the rule hit and the semantic candidates."""
        if not intent_id:
            return None
        if rule_result.rule_id and str(rule_result.rule_id) == intent_id:
            return rule_result.rule
        for candidate in semantic_result.candidates:
            if str(candidate.rule.id) == intent_id:
                return candidate.rule
        return None
```
### 3.6 IntentRouter Upgrade
```python
# Location: app/services/intent/router.py
import asyncio


class IntentRouter:
    """Intent router (upgraded)."""

    def __init__(
        self,
        rule_matcher: RuleMatcher,
        semantic_matcher: SemanticMatcher,
        llm_judge: LlmJudge,
        fusion_policy: FusionPolicy,
        config: FusionConfig | None = None
    ):
        self._rule_matcher = rule_matcher
        self._semantic_matcher = semantic_matcher
        self._llm_judge = llm_judge
        self._fusion_policy = fusion_policy
        self._config = config or FusionConfig()

    async def match_hybrid(
        self,
        message: str,
        rules: list[IntentRule],
        tenant_id: str,
        config: FusionConfig | None = None
    ) -> FusionResult:
        """
        Hybrid routing entry point.

        Flow:
        1. Run RuleMatcher and SemanticMatcher in parallel
        2. Decide whether to trigger LlmJudge
        3. Run FusionPolicy
        4. Return the fusion result

        Args:
            message: user message
            rules: rule list
            tenant_id: tenant ID
            config: fusion configuration (optional; overrides the default)

        Returns:
            FusionResult: fusion result
        """
        effective_config = config or self._config
        rule_result, semantic_result = await asyncio.gather(
            asyncio.to_thread(self._rule_matcher.match, message, rules),
            self._semantic_matcher.match(message, rules, tenant_id)
        )
        llm_result = None
        should_trigger, trigger_reason = self._llm_judge.should_trigger(
            rule_result, semantic_result, effective_config
        )
        if should_trigger:
            candidates = self._build_llm_candidates(rule_result, semantic_result)
            llm_result = await self._llm_judge.judge(
                LlmJudgeInput(
                    message=message,
                    candidates=candidates,
                    conflict_type=trigger_reason
                ),
                tenant_id
            )
        fusion_result = self._fusion_policy.fuse(
            rule_result, semantic_result, llm_result
        )
        return fusion_result

    def match(self, message: str, rules: list[IntentRule]) -> IntentMatchResult | None:
        """
        Original method, kept for backward compatibility.

        Args:
            message: user message
            rules: rule list

        Returns:
            IntentMatchResult | None: matching result
        """
        result = self._rule_matcher.match(message, rules)
        if result.rule:
            return IntentMatchResult(
                rule=result.rule,
                match_type=result.match_type,
                matched=result.matched_text
            )
        return None

    def _build_llm_candidates(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult
    ) -> list[dict]:
        """Build the candidate list for the LLM judge."""
        candidates = []
        if rule_result.rule:
            candidates.append({
                "id": str(rule_result.rule_id),
                "name": rule_result.rule.name,
                "description": f"match type: {rule_result.match_type}, matched text: {rule_result.matched_text}"
            })
        for candidate in semantic_result.candidates[:3]:
            if not any(c["id"] == str(candidate.rule.id) for c in candidates):
                candidates.append({
                    "id": str(candidate.rule.id),
                    "name": candidate.rule.name,
                    "description": f"semantic similarity: {candidate.score:.2f}"
                })
        return candidates
```
---
## 4. Fusion Formula and Default Thresholds
### 4.1 Fusion Formula
```
final_confidence = (w_rule * rule_score + w_semantic * semantic_score + w_llm * llm_score) / total_weight
```
where:
- `total_weight = w_rule + w_semantic + (w_llm if llm_triggered else 0)`
- the result is clamped to `[0.0, 1.0]`
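As a worked example with the default weights: if rule and semantics agree with `rule_score=1.0` and `semantic_score=0.8` and the LLM judge never ran, then `total_weight = 0.5 + 0.3 = 0.8` and `final_confidence = (0.5*1.0 + 0.3*0.8) / 0.8 = 0.925`. A minimal sketch of the formula:

```python
def fused_confidence(
    rule_score: float,
    semantic_score: float,
    llm_score: float,
    llm_triggered: bool,
    w_rule: float = 0.5,
    w_semantic: float = 0.3,
    w_llm: float = 0.2,
) -> float:
    """Weighted fusion as in Section 4.1: the LLM weight only enters
    the denominator when the judge actually ran."""
    total_weight = w_rule + w_semantic + (w_llm if llm_triggered else 0.0)
    score = (w_rule * rule_score + w_semantic * semantic_score + w_llm * llm_score) / total_weight
    return min(1.0, max(0.0, score))
```

Dropping `w_llm` from the denominator when the judge is skipped keeps confidences comparable between the two-way and three-way cases.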
### 4.2 Default Threshold Configuration
```python
DEFAULT_FUSION_CONFIG = FusionConfig(
    w_rule=0.5,                     # rule weight
    w_semantic=0.3,                 # semantic weight
    w_llm=0.2,                      # LLM weight
    semantic_threshold=0.7,         # high-confidence threshold for semantic match
    conflict_threshold=0.2,         # conflict threshold (confidence delta)
    gray_zone_threshold=0.6,        # gray-zone upper bound
    min_trigger_threshold=0.3,      # gray-zone lower bound
    clarify_threshold=0.4,          # clarification trigger threshold
    multi_intent_threshold=0.15,    # multi-intent threshold
    llm_judge_enabled=True,         # enable LLM arbitration
    semantic_matcher_enabled=True,  # enable semantic matching
)
```
### 4.3 Decision Priority
| Priority | Decision reason | Condition |
|--------|----------|------|
| 1 | rule_high_confidence | RuleMatcher hit with score=1.0 |
| 2 | llm_judge | LlmJudge triggered and returned a valid intent |
| 3 | semantic_override | RuleMatcher missed but SemanticMatcher is high-confidence |
| 4 | rule_semantic_agree | Rule and semantic match agree on the same intent |
| 5 | semantic_fallback | SemanticMatcher at medium confidence |
| 6 | rule_fallback | Only the rule matched |
| 7 | no_match | All three paths low-confidence |
---
## 5. Exception Handling Strategy
| Failure scenario | Handling strategy | Observability |
|----------|----------|----------|
| Embedding call fails | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="embedding_failed"` |
| Embedding timeout | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="embedding_timeout"` |
| Qdrant retrieval fails | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="qdrant_error"` |
| LLM call fails | Fall back to the SemanticMatcher result | `llm_judge.reasoning="LLM error: ..."` |
| LLM timeout | Fall back to the SemanticMatcher result | `llm_judge.reasoning="LLM timeout"` |
| LLM response parse failure | Fall back to the SemanticMatcher result | `llm_judge.reasoning="parse_failed"` |
| Rule cache invalidated | Reload from the database | `rule_match.cache_miss=true` |
| Missing configuration | Use the default configuration | `fusion.using_default_config=true` |
---
## 6. Data Model Changes
### 6.1 IntentRule Entity Extension
```python
# Location: app/models/entities.py
class IntentRule(SQLModel, table=True):
    # ... existing fields ...
    intent_vector: list[float] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True, comment="intent vector (precomputed)")
    )
    semantic_examples: list[str] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True, comment="semantic example sentences")
    )
```
### 6.2 ChatMessage Entity Extension
```python
# Location: app/models/entities.py
class ChatMessage(SQLModel, table=True):
    # ... existing fields ...
    route_trace: dict[str, Any] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True, comment="intent route tracing log")
    )
```
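For illustration, a persisted `route_trace` payload might look like the following. The values are made up; the key layout mirrors the RouteTrace dataclass in 3.1 and the dicts built by `FusionPolicy.fuse()`:

```python
# Hypothetical example values; keys follow RouteTrace / FusionPolicy.fuse().
example_route_trace = {
    "rule_match": {
        "rule_id": None, "match_type": None, "matched_text": None,
        "score": 0.0, "duration_ms": 2,
    },
    "semantic_match": {
        "top_candidates": [{"rule_id": "a1", "name": "refund", "score": 0.82}],
        "top_score": 0.82, "duration_ms": 45,
        "skipped": False, "skip_reason": None,
    },
    "llm_judge": {
        "triggered": False, "intent_id": None, "score": 0.0,
        "duration_ms": 0, "tokens_used": 0,
    },
    "fusion": {
        "weights": {"w_rule": 0.5, "w_semantic": 0.3, "w_llm": 0.2},
        "final_confidence": 0.82, "decision_reason": "semantic_override",
    },
}
```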
---
## 7. API Design
### 7.1 Fusion Configuration API
```
GET /admin/intent-rules/fusion-config
PUT /admin/intent-rules/fusion-config
```
### 7.2 Intent Vector Generation API
```
POST /admin/intent-rules/{id}/generate-vector
```
### 7.3 Monitoring API Extension
```
GET /admin/monitoring/conversations/{id}
```
The response gains a `route_trace` field.
---
## 8. Performance Optimization
### 8.1 Parallel Execution
RuleMatcher and SemanticMatcher run in parallel via `asyncio.gather`; the synchronous rule matcher is offloaded with `asyncio.to_thread`.
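A self-contained sketch of this pattern with stub matchers in place of the real ones: the sync matcher runs in a worker thread while the async matcher awaits I/O, so total latency is roughly the max of the two branches rather than their sum.

```python
import asyncio
import time


def rule_match(message: str) -> str:
    """Stub for the synchronous RuleMatcher (CPU-bound keyword/regex work)."""
    time.sleep(0.05)  # simulate matching cost
    return f"rule:{message}"


async def semantic_match(message: str) -> str:
    """Stub for the async SemanticMatcher (embedding-call I/O)."""
    await asyncio.sleep(0.05)  # simulate network latency
    return f"semantic:{message}"


async def match_parallel(message: str) -> list[str]:
    # gather returns results in argument order: [rule_result, semantic_result].
    return await asyncio.gather(
        asyncio.to_thread(rule_match, message),
        semantic_match(message),
    )
```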
### 8.2 Timeout Control
- SemanticMatcher timeout: 100ms
- LlmJudge timeout: 2000ms
### 8.3 Caching Strategy
- Rule cache: reuse the existing RuleCache (TTL=60s)
- Vector cache: optional; to be considered in a later iteration
---
## 9. Risks and Open Questions
### 9.1 Risks
| Risk | Severity | Mitigation |
|------|------|----------|
| Embedding calls add latency | Medium | Set a timeout; skip on timeout |
| Frequent LLM Judge triggering raises cost | Medium | Tune trigger thresholds sensibly |
| Semantic vector configuration is complex | Low | Provide an auto-generation API |
### 9.2 Open Questions
- Does the intent vector index need a dedicated Qdrant collection?
- Should LlmJudge token consumption be metered and billed separately?
- Should the fusion configuration support per-tenant overrides?