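Writing Caveat: Claims to NOT Make

This document records framings that came up in discussion but do not survive reviewer scrutiny. When writing the grand proposal, paper, or rebuttal, avoid using these arguments as the main line of defense.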

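Caveat 1: Do not draw an ontological distinction between interaction preference and item preference

The fragile claim

“Item preference is a fact about content (WHAT); interaction preference is meta-level information about the form of interaction (HOW). The two belong to different categories.”

Why it does not hold up

At the implementation level of a memory system, both are:

Stored facts (“user likes pizza” / “user prefers terse responses”)
Conditions the system uses to modulate future behavior

Both are “meta-level information that modulates memory use”. The HOW/WHAT dichotomy is cognitive-science-flavored rhetoric, not a structural difference at the system level. A reviewer will dismiss it in one sentence.

Alternative framings that are safe to use

Operational distinction (not ontological): the two kinds of preference can be isolated independently in measurement. We make no claim that they differ ontologically.
Structural regularity (see Caveat 3): interaction preference has a small, stable taxonomy (15 attributes × 4 contexts), while item preference is combinatorially unbounded. This difference is methodological, not ontological.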

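Caveat 2: Do not claim that a new memory architecture is needed

The fragile claim

“Interaction preference requires a dedicated memory architecture.”

Why it does not hold up

We do not build memory systems. We are benchmark authors; what we test is how well existing memory systems (mem0, MemOS, Zep, file-baseline) handle interaction preference.

Correct framing

“We provide a benchmark for evaluating how existing memory architectures differ in their performance on interaction preference learning. The contribution is an evaluation methodology, not a new system.”

This boundary must be stated in the abstract and in the first paragraph of the intro. Otherwise the paper will be misread as “the authors propose a new memory system” and scored against that expectation, which would cost us badly.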

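Caveat 3: Item preference can also have context-dependency / evolution / implicit signals

The fragile claim

“Context-dependency, evolution, and implicit signals are three structural properties unique to interaction preference.”

Why it does not hold up

A reviewer will immediately offer counterexamples:

Context-dependency: the same person may prefer pizza at home and sushi at a business lunch.
Evolution: tastes change, and brand loyalty drifts.
Implicit signals: recommender systems have been inferring item preference from behavior for decades.

So these properties are not “unique”.

Alternative framings that are safe to use

The difference is one of degree and measurability, not presence versus absence:

Dimension	Interaction preference	Item preference
Taxonomy size	Fixed, ~15 attributes × 4 contexts	Combinatorially unbounded (all products / people / topics)
Signal density per interaction	Activated in every interaction (tone, verbosity, and confirmation are all exposed)	Sparse (a single message rarely reveals a pizza preference)
Ground-truth constructibility	Can be attributed from character scripts to the 15×4 matrix	Usually requires statistical inference over large amounts of behavioral data

Core sentence: the structural regularity of interaction preference makes benchmark design possible; the combinatorial explosion of item preference makes an equivalent benchmark hard to construct. This is a methodological difference, not a claim that the phenomenon is unique.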

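Caveat 4: Do not use “bottleneck / real pain point” as the main argument

The fragile claim

“Factual recall is basically solved; interaction preference is the real bottleneck for PA deployment.”

Why it does not hold up

This is an assertion with no cited user-research data behind it. A reviewer will ask for a citation, and we currently have no hard evidence to offer.

Safer alternatives

Any anecdote about users being dissatisfied with memory must come with a citation (survey report / product post-mortem / HCI paper).
In the motivation paragraph of the introduction, use a bounded formulation:

“Beyond factual recall, personalization also depends on behavioral adaptation [cite HCI paper]. Yet behavioral-adaptation evaluation remains under-specified in current memory benchmarks.”

Do not use absolute phrasing such as “the real bottleneck” unless we have data in hand.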

Caveat 5: Do not claim that our benchmark is “the” preference benchmark

The fragile claim

“Our benchmark covers the full picture of preference learning for PAs.”

Why it does not hold up

We explicitly test only interaction preference. We do not test content/item preference. This is a scope choice, not full coverage.

Correct framing

“We benchmark interaction preference learning specifically, as a complement to existing memory benchmarks that focus on factual recall and item-level personalization.”

Frame it as complementary, not as a replacement.

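Core arguments that can serve as the main line of defense (by strength)

After filtering through the caveats above, only the arguments below truly hold up. Use them as the primary arguments when writing:

Main argument A (methodological necessity)

Interaction preference has an unusual structural regularity: it can be characterized by a matrix of 15 attributes × 4 contexts, and the ground truth of each cell can be constructed from a script / character canon. This makes it possible to build an evaluation matrix with reproducible ground truth. For item preference, an equivalent benchmark is hard to construct because the space is combinatorially unbounded.

→ This does not claim that interaction preference is more important; it claims that it happens to be the tractable part of preference learning. Getting the tractable part solid first is a reasonable way to carve up the science.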
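
As a rough illustration of what “a matrix with constructible ground truth” could look like in code, the sketch below creates one slot per (attribute, context) pair and tracks how many cells have been annotated. The attribute and context names are placeholders (this page does not enumerate the 15 attributes or the 4 contexts), and `Cell`, `empty_matrix`, `annotate`, and `coverage` are hypothetical helpers, not part of any existing codebase.

```python
from dataclasses import dataclass
from itertools import product

# Placeholder names only: the real taxonomy (15 attributes x 4 contexts)
# is not enumerated on this page.
ATTRIBUTES = ["verbosity", "tone", "confirmation"]
CONTEXTS = ["context_a", "context_b"]


@dataclass(frozen=True)
class Cell:
    """One (attribute, context) slot of the evaluation matrix."""
    attribute: str
    context: str


def empty_matrix(attributes, contexts):
    """Create one slot per pair; None means no ground truth assigned yet."""
    return {Cell(a, c): None for a, c in product(attributes, contexts)}


def annotate(matrix, attribute, context, value):
    """Record the ground-truth value derived from the script for one cell."""
    cell = Cell(attribute, context)
    if cell not in matrix:
        raise KeyError(f"unknown cell: {attribute!r} in {context!r}")
    matrix[cell] = value


def coverage(matrix):
    """Fraction of cells that already have a ground-truth value."""
    filled = sum(value is not None for value in matrix.values())
    return filled / len(matrix)


matrix = empty_matrix(ATTRIBUTES, CONTEXTS)
annotate(matrix, "verbosity", "context_a", "terse")
print(len(matrix), coverage(matrix))
```

With the full taxonomy the same construction would yield 15 × 4 = 60 cells, a size at which exhaustive annotation stays feasible. No comparable enumeration exists for item preference.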

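Main argument B (the gap in existing benchmarks)

Existing memory / personalization benchmarks (LOCOMO, LongMemEval, PrefEval, MemoryBench) focus on factual recall or content recommendation accuracy. None of them systematically measures a memory system's ability at behavioral adaptation across contexts and over time. This is a gap.

→ This argument does not depend on “more important”, only on “not yet measured”. It is the safer one.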

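Main argument C (mechanistic argument / the model-learning perspective)

Core claim: existing memory architectures are designed for the “retrieve facts → inject context → let base model use” paradigm. This paradigm is sufficient for content preference; for interaction preference there is a structural mismatch, which shows up at the following three levels.

This argument has more technical bite than A and B because it speaks directly to the interface between the memory architecture and the base model. In writing, it can serve as the second paragraph of the intro (first paragraph draws the gap = main argument B; second paragraph gives the mechanistic explanation = main argument C).

C.1 Prior Interference Asymmetry

Claim: the base model already has a strong policy prior over interaction-related behavior (verbosity, confirmation, hedging, autonomy). A user-specific interaction preference has to override this prior; content preference faces no such prior and only needs to be added.

Benchmark prediction: the same memory architecture may succeed at “explicitly recalling the preference” yet fail at “actually applying the preference in behavior”, because the retrieved information is overridden by the base prior. A paired probe design (explicit question vs. behavioral verification) can demonstrate this gap empirically.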
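
One way the paired probe could be scored is sketched below: each preference gets an explicit-recall outcome and a behavioral-apply outcome, and the gap is the share of recalled preferences that were not applied. `PairedProbeResult`, `recall_apply_gap`, and the preference identifiers are hypothetical names for illustration; how "recalled" and "applied" are actually judged is left open here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PairedProbeResult:
    preference_id: str
    recalled: bool  # explicit probe: the system states the preference when asked
    applied: bool   # behavioral probe: the response actually follows it


def recall_apply_gap(results: List[PairedProbeResult]) -> Optional[float]:
    """Among preferences that were recalled, the fraction that were not applied.

    Returns None when nothing was recalled, since the gap is undefined then.
    """
    recalled = [r for r in results if r.recalled]
    if not recalled:
        return None
    return sum(not r.applied for r in recalled) / len(recalled)


# Made-up outcomes for illustration only, not benchmark results.
example = [
    PairedProbeResult("terse_replies", recalled=True, applied=False),
    PairedProbeResult("confirm_before_acting", recalled=True, applied=True),
    PairedProbeResult("no_hedging", recalled=False, applied=False),
]
print(recall_apply_gap(example))
```

Conditioning on `recalled` is a deliberate choice in this sketch: it separates "the memory system never surfaced the preference" from "the preference was surfaced but the default behavior won", which is the asymmetry C.1 is about.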

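⚠️ Cautions when writing:

Do not claim a specific source for the prior (RLHF vs. the “helpful assistant” corpus distribution in pretraining vs. instruction tuning vs. safety alignment). Attributing it to a particular training stage requires mechanistic interpretability evidence, which we will not produce.
Neutral wording: consistently use “behavioral prior induced during training” or “base-model default interaction policy”. These cover every possible source without taking sides.
Note in particular: a base model (pre-RLHF) may exhibit a similar prior, because the “helpful assistant” style is already present in the pretraining corpus. This means the problem cannot be sidestepped by “removing RLHF”, but it also means we cannot single out RLHF, or a reviewer will point to the counterexample.
A reviewer will ask: “Why not simply SFT the base model on user-specific data?” Answer: (a) the benchmark's goal is to evaluate the alignment capability of memory systems, and SFT is an orthogonal approach; (b) in the PA setting it is not feasible to fine-tune the base model for every user. Address this question preemptively in the paper.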

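C.2 Retrieval Granularity vs. Generation Granularity

Claim: content preference is sparsely activated (topic-triggered), which matches the existing query-triggered retrieval interface. Interaction preference is activated on every generation (there is no moment where verbosity or tone is “not involved”) and does not correspond to any obvious retrieval query. The existing retrieval interface is structurally mismatched with how interaction preference is used.

Benchmark prediction: memory architectures with different retrieval mechanisms should diverge systematically on CSS / ETS. An always-on summary layer (MEMORY.md-style, MemGPT-style) should be more stable on interaction preference than a pure flat vector store.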
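
The mismatch can be made concrete with a deliberately simplified toy. The sketch below uses word overlap as a stand-in for retrieval (real systems use embeddings or other scoring, so this is an assumption, not how mem0, Zep, or MemGPT work) and compares a retrieval-only context with one that also carries an always-on layer. `retrieve` and `build_context` are hypothetical helpers.

```python
def retrieve(query, store, k=2):
    """Toy query-triggered retrieval: rank memories by shared words with the query.

    Memories with no overlap are dropped, mimicking a relevance threshold.
    """
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(m.lower().split())), m) for m in store]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in scored[:k]]


def build_context(query, store, always_on=()):
    """Always-on entries are injected on every turn; the rest is query-triggered."""
    return list(always_on) + retrieve(query, store)


content_pref = "user likes pizza"
interaction_pref = "user prefers terse responses"
query = "draft an email to my landlord"

# Retrieval-only: both preferences sit in the store and must be triggered by the query.
retrieval_only = build_context(query, [content_pref, interaction_pref])

# Hybrid: the interaction preference lives in an always-on layer.
hybrid = build_context(query, [content_pref], always_on=[interaction_pref])

print(retrieval_only)
print(hybrid)
```

In this toy, the query shares no words with either memory, so the retrieval-only context is empty even though the terseness preference applies to the reply; the hybrid context still carries it. A topic-matching query such as one mentioning pizza would surface the content preference in both setups.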

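⚠️ Cautions when writing:

Draw a clear boundary around what “existing architectures” refers to. Newer architectures (Letta, LangMem, the latest mem0) already combine an always-on summary with retrieval. Do not strawman.
Do not claim that retrieval fundamentally does not work. System prompt injection combined with selective retrieval is effective; the benchmark tests which combination works better on interaction preference, and does not reject the retrieval paradigm.
The wording should be “the retrieval-only paradigm does not match the dense activation pattern of interaction preference”, not “retrieval is useless for interaction preference”.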

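C.3 Reflexive Reference

Claim: interaction preference is a constraint on the model's own output behavior (“user wants me to be terse”), not a fact about the world. Applying it requires the model to have some implicit self-model of its own generation distribution. Content preference has no such reflexive structure.

Benchmark prediction: models with explicit reasoning trace / reflection capability (thinking-mode Claude, o1/o3) should gain disproportionately on applying interaction preference, because they can explicitly self-correct. This should appear on the benchmark as a significant “model capability × memory architecture” interaction term.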
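
For a 2×2 slice of the experiment matrix, the interaction term reduces to a difference-in-differences on mean scores. The sketch below computes only that point estimate; the architecture labels and the numbers are made up for illustration (they are not results), and `interaction_contrast` is a hypothetical helper. Whether the contrast is significant would still need per-item scores and a proper statistical test.

```python
def interaction_contrast(mean_score, arch_base, arch_alt):
    """Difference-in-differences over a 2x2 grid of mean benchmark scores.

    mean_score maps (is_reasoning_model, architecture) -> mean score.
    A positive value means the reasoning model gains more from switching
    arch_base -> arch_alt than the non-reasoning model does.
    """
    gain_reasoning = mean_score[(True, arch_alt)] - mean_score[(True, arch_base)]
    gain_plain = mean_score[(False, arch_alt)] - mean_score[(False, arch_base)]
    return gain_reasoning - gain_plain


# Illustrative placeholder scores on a 0-100 scale, not measured values.
scores = {
    (False, "flat_store"): 40,
    (False, "always_on"): 50,
    (True, "flat_store"): 45,
    (True, "always_on"): 70,
}
print(interaction_contrast(scores, "flat_store", "always_on"))
```

If the contrast is near zero in the real data, the two factors are roughly additive, which is the outcome under which C.3 should be demoted to the discussion.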

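⚠️ Cautions when writing (this is the easiest of the three to overreach on):

“The model needs self-modeling” is an open question in mechanistic interpretability. Do not claim that the architecture must have a reflexive module.
Fallback wording: say “may benefit from” rather than “requires”; say “reasoning-capable models may better self-correct against interaction-preference targets” rather than “needs reflexive self-model”.
Whether this argument stays depends on empirical results: if the benchmark shows no significant advantage for reasoning models, it should leave the core defense and remain only in the discussion as a follow-up question. Writing order: finalize C.3 only after the experimental results are in; do not draft it ahead of time.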

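Content that needs further research

To make the main arguments hold up, we still need:

A precise comparison table in related work: list what each memory / preference benchmark measures, so that the gap is visible. Do not make the reviewer infer it.
Empirical support for taxonomy stability: 15×4 cannot rest on “we think this split is reasonable”; it needs grounding in the literature (PrefIx, HCI literature, our own pattern-to-preference rubrics).
At least one preliminary result showing that existing memory systems are significantly differentiated on our benchmark. Otherwise even “whether it can be measured at all” is in doubt.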

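Next discussion items

This caveat was written so that we are honest with ourselves first at the framing level. Next we need to decide:

Should the intro of the grand proposal be rewritten, from the current “why it matters” style of argument to a “why it’s tractable + what’s missing” style of argument?
What level of empirical backing does “structural regularity” in argument A need to be solid enough?
How do we draw the “we do not build memory systems” boundary within the first 200 words of the abstract / intro?
For the source of the prior in C.1, writing will consistently use neutral wording such as “training-induced behavioral prior”; should we cite a few mechanistic interpretability works (on how RLHF affects behavior distributions) as soft backing for this argument?
Should the C.1 paired probe (explicit-recall vs. behavioral-apply) be written into the data/test_interactions/ schema as a required field? This would change the data design.
The experimental prerequisite for C.3 (pairing reasoning and non-reasoning models) should be designed explicitly into the experiment matrix; if the results show no significant interaction, C.3 is demoted from the core defense to the discussion.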
